Part-of-speech tagging and chunk parsing of spoken Dutch using support vector machines
Abstract
This paper describes the design and evaluation of a part-of-speech tagger and chunk parser for spoken Dutch, using support vector machines. The data in the Corpus Gesproken Nederlands (Spoken Dutch Corpus) is split into smaller subproblems to obtain reasonable training and tagging speed using various kernel types. The tagger combines good accuracy with reasonable tagging speed. The chunk parser shows good accuracy, but suffers from low speed.
Similar resources
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing of natural language that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline mode. Unfortunately, in pipel...
A Fast Boosting-based Learner for Feature-Rich Tagging and Chunking
Combination of features contributes to a significant improvement in accuracy on tasks such as part-of-speech (POS) tagging and text chunking, compared with using atomic features. However, selecting combination of features on learning with large-scale and feature-rich training data requires long training time. We propose a fast boosting-based algorithm for learning rules represented by combinati...
Improved Arabic Base Phrase Chunking with a new enriched POS tag set
Base Phrase Chunking (BPC), or shallow syntactic parsing, is proving to be a task of interest to many natural language processing applications. In this paper, a BPC system is introduced that improves over state-of-the-art performance in BPC using a new part-of-speech (POS) tag set. The new POS tag set, ERTS, reflects some of the morphological features specific to Modern Standard Arabic. ERTS expl...
Target Word Detection and Semantic Role Chunking using Support Vector Machines
In this paper, the automatic labeling of semantic roles in a sentence is considered as a chunking task. We define a semantic chunk as the sequence of words that fills a semantic role defined in a semantic frame. It is straightforward to convert chunking into a tagging task using one of several IOB representations. Using this representation each word is tagged with I, which means that the word i...
A Memory-Based Shallow Parser for Spoken Dutch
We describe the development of a Dutch memory-based shallow parser. The availability of large treebanks for Dutch, such as the one provided by the Spoken Dutch Corpus, allows memory-based learners to be trained on examples of shallow parsing taken from the treebank, and act as a shallow parser after training. An overview is given of a modular memory-based learning approach to shallow parsing, c...
Publication date: 2006